Abstract

Recent work in traffic analysis has shown that traffic patterns leaked through side channels can be 
used to recover important semantic information. For instance, attackers can find out which website, or 
which page on a website, a user is accessing simply by monitoring the packet size distribution. We show 
that traffic analysis is an even greater threat to privacy than previously thought by introducing a new attack
that can be carried out remotely. In particular, we show that, to perform traffic analysis, adversaries do 
not need to directly observe the traffic patterns. Instead, they can gain sufficient information by sending 
probes from a far-off vantage point that exploit a queuing side channel in routers.

To demonstrate the threat of such remote traffic analysis, we study a remote website detection attack 
that works against home broadband users. Because the remotely observed traffic patterns are noisier
than those obtained using previous schemes based on direct local traffic monitoring, we take a dynamic 
time warping (DTW) based approach to detecting fingerprints from the same website. As a new twist 
on website fingerprinting, we consider a website detection attack, where the attacker aims to find out 
whether a user browses a particular website, and its privacy implications. We show experimentally that,
although the success of the attack is highly variable, depending on the target site, for some sites it achieves
very low error rates. We also show how such website detection can be used to deanonymize message board users.

1 Introduction 

Traffic analysis is the practice of inferring sensitive information from patterns of communication. Recent 
research has shown that traffic analysis applied to network communications can be used to compromise 
users' secrecy and privacy. By using packet sizes, timings, and counts, it is possible to fingerprint websites
visited over an encrypted tunnel [2, 4, 11, 17], infer keystrokes sent over a secure interactive connection [27,
34], and even detect phrases in VoIP sessions [31-33]. These attacks have been explored in the context of a
local adversary who can observe the target traffic directly on a shared network link or can monitor a wireless 
network from a nearby vantage point [25]. 

We consider an alternate traffic analysis approach that is available to remote adversaries. We notice 
that it is possible to infer the state of a router's queue through the observed queueing delay of a probe 
packet. By sending frequent probes, the attacker can measure the dynamics of the queue and thus learn 
an approximation of the sizes, timings, and counts of packets arriving at the router. In the case of home 
broadband networks, in particular, DSL lines, the attacker can send probe packets from a geographically 
distant vantage point, located as far away as another country; the large gap between the bandwidth of the 
DSL line and the rest of the Internet path makes it possible to isolate the queueing delay of the "last-mile" 
hop from that experienced elsewhere. 

To demonstrate the feasibility of using remote traffic analysis in a real attack that learns sensitive
information, we adapt the website fingerprinting attack [2, 11, 17], previously targeted at local victims, to a
new scenario and introduce a remote website detection attack. Our attack can find out when a victim user 
under observation visits a particular target site without directly monitoring the user's traffic. This would 
allow, for example, a company to find out when its employees visit its competitors' sites from their home 
computers or deanonymize users of web boards. 

In our adaptation, we encountered two challenges: the information obtained through remote traffic analysis
is noisier than in the local case, and there is no easily available training set from which to create
a fingerprint. To address the former problem, we improved on the previous inference methodology, which
used the distribution of packet sizes and inter-arrival times, and developed a fingerprint detection technique
that makes use of ordered packet size sequences and the dynamic time warping (DTW) distance metric. To
create a training set, we designed a testbed that uses an emulated DSL link and a virtual execution
environment to replicate the victim's home environment.

Figure 1: Queueing side channel

To evaluate our work, we sent probes to a home DSL line in the United States from a rented server in
a data center near Montreal, Canada; we chose this setup to demonstrate the low cost and low barrier to entry
of the attack. We then compared the probe results with profiles of website fetches generated in a
virtual testbed at our university. We tested our attack on detecting each of a list of 1,000 popular websites.
We found that detection performance was highly variable; however, for a significant fraction of sites, it was 
possible to obtain very low false-positive rates without incurring significant false negatives. We also found 
that there is some accuracy loss due to the discrepancies in the test and training environments (distant from 
each other) that we were not (yet) able to eliminate. If the training and test data are both collected from the 
same location, a much larger fraction of sites can be accurately detected with low error rates. We find that 
despite working with a much noisier information source than previous web fingerprinting work [2, 11, 17], 
our website detection attack nevertheless shows that remote traffic analysis is a serious threat to Internet 
privacy. 

The rest of the paper is organized as follows. We describe our approach to remote traffic analysis in §2. 
In §3 we describe our adaptation of previous website fingerprinting attacks to remotely confirming a user's
browsing activities. We evaluate our website detection attack in §4. We then discuss further extensions and
the limitations of our technique in §5 and present related work in §6, concluding in §7. 

2 Remote Traffic Analysis 

Traffic analysis attacks have been known to be effective for quite some time. And yet, for most Internet 
users, they represent a minor concern at best. Although a dedicated attacker could always intercept traffic 
by, say, bribing a rogue ISP employee, or tapping a switch box, he would run the risk of being caught and 
potentially incurring criminal charges. In any case, this level of effort seems justified only for highly
sensitive material, rather than casual snooping; therefore, as long as sensitive data are protected by
encryption or other techniques, a user may feel relatively safe.

We show, however, that traffic analysis can be carried out at a significantly lower cost, and by attackers
who never come into physical proximity with the user. In fact, the attackers can launch their attacks from
another state or country, as long as they have access to a well-provisioned Internet connection. This, in turn,
is very easy to obtain due to the highly competitive Internet hosting business sector: a virtual private server
in a data center can cost as little as a few dollars a month.¹ We show that the attacker's traffic has a very
low rate, so attackers do not need to incur high bandwidth costs. Furthermore, users who are being spied upon
are unlikely to notice the small amount of performance overhead. Thus, anyone with a credit card² can carry
out the attack and leave little trace.

In this section, we describe our approach to remote traffic analysis. We first introduce the queueing side
channel, which is the basis of the attack. Then we design an algorithm to recover users' traffic patterns
from the information leaked through this side channel.

2.1 Queuing Side Channel 

We consider the following scenario. Alice is a home user in Sometown, USA, browsing a website via her
DSL Internet connection. Her computer is connected to a broadband router, using a wireless or wired LAN
connection.³ The router is connected via a DSL line to a DSLAM⁴ or similar device operated by her ISP,
which is then (eventually) connected to the Internet. Unbeknownst to Alice, Bob, who is located in another
state or another country, wishes to attack Alice's privacy. If Bob knows Alice's IP address (for example, if
Alice visited a site hosted by Bob), he can use his computer to send a series of ICMP echo requests (pings)
to the router in Alice's house and monitor the responses to compute the round-trip times (RTTs). One
component of the RTTs is the queueing delay that the packets experience at the DSLAM prior to being
transmitted over the DSL line; thus the RTTs leak information about the DSLAM queue size. This leakage in
turn reveals traffic patterns pertaining to Alice's activities.

Since the probe packets traverse many Internet links, and the queuing delays on Alice's DSL link are but
one component of the RTT, the question is, how much information is leaked by this side channel?
Furthermore, can it be used to infer any information about Alice's activities? To evaluate the potential of
this attack, we carried out a test on a home DSL link located in the USA. In the test, Alice opens the web
page www.yahoo.com on her computer. Simultaneously, Bob, in Canada, sends a ping request every 10 ms to
Alice's home router. Figure 2(a) depicts the traffic pattern of Alice's download traffic. The height of each
peak in the figure represents the total size of packets that are downloaded during each 10 ms interval.
Figure 2(b) plots the RTTs of Bob's ping requests. We can see a visual correlation between the traffic
pattern and observed RTTs; whenever there is a large peak in the user's traffic, the attacker observes a
correspondingly large RTT.

Figure 2: Real traffic on a DSL vs. probe RTTs. Alice resides in some town in the US. Bob is located in
Montreal, Canada.

Figure 3: Measurement of DSL Probe Variance. (a) Empirical CDF of ping RTTs from a typical host (x-axis:
RTT in ms). (b) Empirical CDF of 95th percentile minus minimum RTTs (x-axis: RTT span between 95th
percentile and minimum, in ms).

Table 1: RTTs measured by pinging from worldwide vantage points (in ms)

Node                             Mean       StdDev
planet6.cs.ucsb.edu               66.590     0.543
pl1.rcc.uottawa.ca                42.936     0.619
pl1.grid.kiae.ru                 153.205     0.749
planetlab2.c3sl.ufpr.br          177.318     0.868
planet2.pnl.nitech.ac.jp         197.043     0.567
pl1.eng.monash.edu.au            221.460     1.784
planetlab1.xeno.cl.cam.ac.uk     319.752     2.297
planetlab2.comp.nus.edu.sg       291.193     4.221

¹ See, for example, www.vpslink.com (retrieved February 2011).

² Working stolen credit cards are an easily acquired commodity on the black market [10].

³ In some cases, Alice's computer might be connected to the DSL line directly.

⁴ DSL access multiplexer

The correlation between Alice's traffic and Bob's observed probe RTTs can be explained as follows. The 
RTTs include both the queuing delay incurred on the DSL link and delays on intermediate routers, which 
sit between Bob's computer and Alice's router. The intermediate routers are typically well provisioned and 
are unlikely to experience congestion [1, 16]; furthermore, the intermediate links have high bandwidth and 
thus queueing delays will be small in all cases. We validate this using our own measurements in the next 
subsection. 

On the other hand, Alice's DSL link is, by far, the slowest link that both her traffic and Bob's probes are
likely to traverse. The queues at Alice's router can grow to be quite long (in relative terms), due to TCP
behaviors, which cause the www.yahoo.com server to send a batch of TCP packets at a fast rate. As most
routers schedule packets in a First In First Out (FIFO) manner, this congestion leads to large queuing
delays for Bob's ping packets. We saw that the additional delay caused by Alice's incoming traffic could
exceed 100 ms. Thus, Alice's traffic patterns are clearly visible in the RTTs.

2.2 Measurement of DSL Probe Variance



To further confirm that fluctuations in the RTTs to a user's host are primarily determined by congestion at
the DSL link, we conducted a small Internet measurement study. We harvested IP addresses from access logs
of servers run by the authors. We noted that many DSL providers assign a DNS name containing "dsl" to
customers. Using reverse DNS lookups, we were able to locate 918 potential DSL hosts. To determine
each host's suitability for measurement, we first determined whether it responded to ping requests. We then
used traceroute to locate the hop before the target DSL host (e.g., the DSLAM). Next, we ensured that this
previous hop also responded to ping requests. Lastly, we measured the minimum RTT of several hundred ping
probes. We excluded any host with a minimum RTT greater than 100 ms to bound the study to hosts in a wide
geographical area around our Montreal, Canada probe server. Using this method, we identified 189 DSL
hosts to measure. The measurement consisted of sending ping probes every 2 ms for 30 seconds to the target
DSL host and then to its previous hop. We collected these traces in a loop over a period of several hours.

We found that, on average, the target host RTT was ~10 ms greater than that of the previous hop. We
frequently observed the pattern in Figure 3(a), where the previous hop RTT was very stable while the target
RTT varied by more than 10 ms. We then measured the span between the 95th percentile and the minimum
observation in each sample. Figure 3(b) shows the CDFs of this data for each target DSL host and its
previous hop from the measurement set. We observe that the previous hop RTT span shows more stability than
that of the end host, confirming one of the primary assumptions of our work.
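
As an illustration of the span statistic used above, the following Python sketch computes the 95th percentile minus the minimum of a set of RTT samples. It is not the measurement code used in the study: the function name `rtt_span_ms`, the nearest-rank percentile convention, and the sample values are all assumptions made for the example.

```python
import math


def rtt_span_ms(rtts_ms, percentile=95.0):
    """Return (percentile-th RTT) - (minimum RTT) for one sample, in ms.

    Uses the nearest-rank percentile definition (an assumption; the paper
    does not say which percentile convention was used).
    """
    if not rtts_ms:
        raise ValueError("need at least one RTT sample")
    ordered = sorted(rtts_ms)
    # Nearest-rank: smallest rank r such that r / n >= percentile / 100.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1] - ordered[0]


# Hypothetical samples: a stable previous hop and a more variable DSL host.
previous_hop = [20.0, 20.1, 20.2, 20.1, 20.0]
dsl_host = [30.0, 31.5, 44.0, 30.2, 58.0]
print(rtt_span_ms(previous_hop), rtt_span_ms(dsl_host))
```

A small span for the previous hop and a large span for the DSL host is the pattern the measurement looks for.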

To confirm the feasibility of learning the target user's traffic remotely (from another city, or even another
country), we examined the RTTs from vantage points in different locations. In this experiment, we
used a series of PlanetLab nodes located in 8 countries to send frequent ping requests to a DSL IP address in
the US, while keeping the host behind the DSL line idle; i.e., no background traffic went through the DSL
link. The measured RTT standard deviations, as listed in Table 1, thus represent the background noise for
learning target traffic patterns. Recall that in the queuing side channel, the user's traffic patterns are
leaked through the extra delays experienced by the probe ping packets. On a common home DSL link with a
download speed of 3 Mbps, the delay induced by one packet of 1500 bytes is about 4 ms. This is much higher
than the standard deviations of RTTs at the first 5 nodes. Therefore, the traffic patterns would survive well
in the RTTs observed at those nodes. In fact, many of the patterns we observe involve multiple back-to-back
packets, creating extra delays of tens or even hundreds of milliseconds. Such patterns would be detectable
even on the worst of these links, although for the best attack quality, the attacker should pick a vantage
point relatively close to the target (e.g., on the same continent).
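
The 4 ms figure follows directly from the packet size and link rate. The short Python sketch below reproduces that arithmetic and compares the per-packet delay with the RTT standard deviations from Table 1. The function name is hypothetical and the ratio is only an informal signal-to-noise indicator, not a metric defined in the paper.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time needed to transmit one packet on the link, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps


# One 1500-byte packet on a 3 Mbps downlink, as in the text.
delay_ms = serialization_delay_ms(1500, 3_000_000)

# RTT standard deviations (ms) copied from Table 1, in table order.
noise_sd_ms = [0.543, 0.619, 0.749, 0.868, 0.567, 1.784, 2.297, 4.221]

# Informal signal-to-noise ratio: delay of a single packet vs. idle-link jitter.
ratios = [delay_ms / sd for sd in noise_sd_ms]
for sd, ratio in zip(noise_sd_ms, ratios):
    print(f"stddev {sd:.3f} ms -> one packet is {ratio:.1f}x the noise")
```

For the first five nodes a single full-size packet stands several times above the noise; for the noisiest node it does not, which is why bursts of back-to-back packets matter there.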

2.3 Traffic Pattern Recovery Algorithm 

We now show how the attacker can analyze the information leaked through this queueing side channel. We
model the incoming DSL link as a FIFO queue. As most traffic volume in an HTTP session occurs on the
download side, we will ignore the queuing behavior on the outgoing DSL link, though it could be modeled in
a similar fashion.

Figure 4 depicts the arrival and departure process in this queuing system. The arrows are Bob's ping
packets, denoted by $P_i$, and the blocks represent HTTP packets downloaded by Alice. The DSLAM serves
packets in FIFO manner and at a constant service rate; i.e., the service time is proportional to the packet
size. As most HTTP packets are more than an order of magnitude larger than ping packets, we ignore the
service time for pings.

Figure 4: FIFO queuing in the DSL router

Algorithm 1 Traffic pattern recovery algorithm

let $t'_0 = 0$
for $i = 1$ to the probe sequence length do
    # reconstruct the arrival and departure times
    $t_i = t_{ping} \cdot i$
    $t'_i = RTT_i - RTT_{min} + t_i$
    # estimate the total size of packets arriving in $[t_{i-1}, t_i]$
    $s_i = t'_i - \max(t'_{i-1}, t_i)$
    # discard noise
    if $s_i < \eta$ then
        $s_i = 0$
    end if
end for



Assume that ping packet $P_i$ arrives in the queue at time $t_i$, waits for the router to serve all the
packets currently in the router, and then departs at time $t'_i$. Let us consider the observed RTT of the
ping packet $P_i$; we can represent it as:

$$RTT_i = \sum_{l \in \text{links on path}} \left( q_i^l + p_i^l + t_i^l \right) \qquad (1)$$

where $q_i^l$, $p_i^l$, and $t_i^l$ are the queueing, propagation, and transmission delays incurred by packet
$P_i$ on link $l$. Note that the propagation and transmission delays are mostly constant, and in fact we can
approximate:

$$\sum_{l \in \text{links on path}} \left( t_i^l + p_i^l \right) \approx \min_j RTT_j \qquad (2)$$

since Bob is likely to experience near-zero queueing delays for some of the pings. Furthermore, as argued
in §2.1, the queueing delays on links other than the DSL line are going to be minimal, thus we can further
approximate:

$$RTT_i \approx \min_j RTT_j + (t'_i - t_i) \qquad (3)$$

Making use of the queuing delay $t'_i - t_i$ from (3), the attacker Bob can further infer the total size of
HTTP packets arriving during each interval $[t_{i-1}, t_i]$, which produces a pattern similar to Alice's
traffic in Figure 2(a). For this purpose, two cases need to be considered.

1. $t_i > t'_{i-1}$. In this case, when $P_i$ enters the queue, the DSLAM is either idle or serving packets
destined for Alice. The delay $t'_i - t_i$ reflects the time required to finish serving the HTTP packets
currently in the buffer, and is thus approximately proportional to the total size of Alice's arrivals
during the interval $[t_{i-1}, t_i]$. $P_2$ in Figure 4 is one example of this case.

2. $t_i < t'_{i-1}$. In this case, $P_{i-1}$ is still in the queue when $P_i$ arrives. Only after $P_{i-1}$
departs at $t'_{i-1}$ can the router start to serve packets that arrived in the interval $[t_{i-1}, t_i]$.
Thus the delay $t'_i - t'_{i-1}$ is the service time for those packets and can be used to recover the total
size. $P_4$ in Figure 4 is one example of this case.

Algorithm 1 summarizes the traffic pattern recovery procedure based on these observations. To account
for minor queueing delays experienced on other links, we define a threshold $\eta$ such that RTT variations
smaller than $\eta$ are considered noise and do not correspond to any packet arrival at the DSLAM.
Figure 2(c) plots the pattern extracted from the RTTs in Figure 2(b). After processing, the resulting time
series proportionally approximates the packet size sequence of the original traffic in Figure 2(a). As will
be shown in the next section, it can be applied to infer more information about Alice's activities, e.g.,
website fingerprinting.

Note that in case 1, the attacker may underestimate the size of the HTTP packets arriving in the period
$[t_{i-1}, t_i]$ because a portion of them will have already been serviced by time $t_i$. The error depends
both on the frequency of the probes and the bandwidth of the DSL link. Since most HTTP packets are of
maximal size (MTU), we can ensure that all such packets are observed by setting the ping period to be less
than $\frac{\text{MTU}}{\text{DSL bandwidth}}$. Thus the adversary must tune the probe rate based on the DSL
bandwidth, and faster links will require a higher bandwidth overhead (but the pings will form a constant,
small fraction of the overall DSL bandwidth).
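
The recovery procedure of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the function and parameter names are hypothetical, the initial departure time $t'_0$ is assumed to be 0, $RTT_{min}$ is taken as the minimum over the given trace, and the conversion from service time to bytes assumes a known, constant downlink rate.

```python
def recover_traffic_pattern(rtts_ms, ping_period_ms, noise_threshold_ms):
    """Estimate, per probe, the service time (ms) of packets that arrived
    at the bottleneck queue since the previous probe (Algorithm 1)."""
    if not rtts_ms:
        return []
    rtt_min = min(rtts_ms)       # approximates propagation + transmission delay
    prev_departure = 0.0         # t'_0, assumed initial value
    pattern = []
    for i, rtt in enumerate(rtts_ms, start=1):
        arrival = ping_period_ms * i            # t_i
        departure = rtt - rtt_min + arrival     # t'_i
        # Case 1 (queue drained before t_i) and case 2 (previous probe still queued).
        s = departure - max(prev_departure, arrival)
        if s < noise_threshold_ms:              # discard noise below eta
            s = 0.0
        pattern.append(s)
        prev_departure = departure
    return pattern


def service_time_to_bytes(pattern_ms, link_bps):
    """Convert service times (ms) to approximate byte counts on a link of link_bps."""
    return [s * link_bps / 8000 for s in pattern_ms]


# Hypothetical probe trace: 10 ms ping period, RTTs in ms.
rtts = [50, 50, 62, 54, 50]
pattern = recover_traffic_pattern(rtts, ping_period_ms=10, noise_threshold_ms=1)
sizes = service_time_to_bytes(pattern, link_bps=3_000_000)
print(pattern, sizes)
```

In this made-up trace, the third probe sees 12 ms of extra delay, which on a 3 Mbps link corresponds to about 4500 bytes (three full-size packets); the fourth probe falls under case 2 and attributes only the 2 ms beyond the previous probe's departure to new arrivals.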

2.4 Properties 

Our remote traffic analysis attack has the following properties that are different from traditional local traffic 
analysis techniques: 

Remote vantage point The attacker does not need to physically capture the target traffic flow. He can 
launch this attack from almost anywhere, even from a different state or country than the target user.

Low cost The attacker can perform this attack as long as he has access to a well-provisioned Internet con- 
nection. Moreover, the probe traffic has a very low rate, e.g., 50 Kbps for 100 pings per second. Thus 
the attacker does not need to incur considerable bandwidth costs, and victims are unlikely to notice 
the additional overhead. Additionally, due to such low cost, the attacker can possibly monitor multiple 
targets simultaneously. 

Coarser observation From the queueing side channel, the attacker can obtain only estimates of the
sum of packet sizes arriving between successive pings and may miss some of the traffic, depending
on the probe frequency. This observation is coarser than in previous traffic analysis work, where local
vantage points enable the attacker to gather information about every single packet, such as the exact
size and inter-packet delays. Thus the performance of a remote traffic analysis attack will generally
be worse than what is possible with local observations. We show that, despite the coarse observation, we
are still able to reconstruct an alarming amount of information from remote hosts using the attack.
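
The probe overhead mentioned under "Low cost" can be checked with a few lines of Python. The sketch below is only a back-of-the-envelope calculation: the 64-byte probe size is an assumption (the text gives only the resulting rate of about 50 Kbps), and the function names are hypothetical.

```python
def probe_overhead_bps(pings_per_second, probe_bytes):
    """Bandwidth consumed by the probe stream in one direction, in bits per second."""
    return pings_per_second * probe_bytes * 8


def overhead_fraction_at_max_period(probe_bytes, mtu_bytes):
    """Share of the link used by probes when one probe is sent per MTU
    transmission time (ping period = MTU / bandwidth). The bandwidth cancels,
    so the share depends only on the two packet sizes."""
    return probe_bytes / mtu_bytes


PROBE_BYTES = 64  # assumed on-the-wire size of one ping; not stated in the text

overhead = probe_overhead_bps(100, PROBE_BYTES)   # 100 pings per second
share_of_3mbps = overhead / 3_000_000             # fraction of a 3 Mbps downlink
print(overhead, round(share_of_3mbps, 4))
print(round(overhead_fraction_at_max_period(PROBE_BYTES, 1500), 4))
```

Under this assumed probe size, 100 pings per second cost 51.2 Kbps, under 2% of a 3 Mbps link, and probing once per MTU transmission time costs roughly 4% of the link regardless of its speed.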

2.5 Feasibility 

We have shown that the attacker can recover the user's traffic pattern through the information leakage of the 
queuing side channel. We now address the feasibility of our attack by further discussing the prevalence of 
the conditions required for this attack. 

2.5.1 ICMP support 

The attack scenario we show above relies on ICMP probe packets, hence we care about whether ICMP 
is enabled in real routers. In testing over 918 probable DSL hosts on the Internet, we found over 25% 
responded to ping requests. Since we harvested these probable DSL hosts from the Internet over a period 
of several months, it is not clear how many that failed to respond were simply down rather than blocking 
our probes. Thus, we can assume that the fraction of hosts that respond to ping is even larger. Additionally, 
in a brief survey of consumer-grade router hardware, we found that many of them do not perform ICMP 
filtering, at least not in the default configuration. Moreover, even where ping packets are blocked by
firewalls on some home routers, other forms of probes may be usable instead; for example, if the home
router exposes TCP ports for file sharing or other applications, SYN packets can be used as probes with the
same effectiveness.

2.5.2 FIFO scheduling policy 

The high correlation between Alice's traffic pattern and Bob's ping RTTs comes from the fact that the router
serves packets in FIFO order. Note that most home routers today do not use QoS extensions and schedule
packets on a given link in FIFO order. Thus, information leaked by these routers can be exploited with
remote traffic analysis. Certainly, a fair queuing implementation [26] would reduce the impact that
cross-traffic would have on the probe sequence and hence reduce the effectiveness of the side channel, but
not entirely eliminate it [15].

2.5.3 Limited Last-hop Bandwidth 

The information leaked through our side channel is the queue length in the router's buffer.
Hence, for nontrivial queues to build up in the buffer, the broadband link must have limited bandwidth
compared to the rest of the links in the path. In our experiments, we used speeds typical of current home
broadband connections (several Mbps), and our scheme worked well in those environments. The deployment of
faster links, such as Fiber-to-the-Home (FTTH), may reduce the effectiveness of the queueing side channel,
but notice that if the core network is similarly upgraded in speed, the bandwidth disparity necessary for our
attack will remain.

2.5.4 Victim's IP address 

In our attack, Bob needs to know Alice's IP address to send the probes. Although this mapping is typically 
only explicitly known to ISPs, many protocols, such as file sharing, instant messaging, VoIP, and email, 
will reveal the IP address of a user. Other forms of IP address reconnaissance may also be possible but are 
outside the scope of this work. 

3 Website Fingerprinting 

Previous work on traffic analysis has shown that it is often possible to identify the website that someone is 
visiting based on traffic timings and packet sizes [2, 11, 17], namely, website fingerprinting. We consider 
whether it is possible to carry out a similar attack using our remote traffic analysis. We first review the three 
basic steps in previous work when conducting a website fingerprinting attack. 

1. First, the attacker selects a feature of web traffic to use for distinguishing websites. The feature needs
to stay relatively stable across accesses to the same website, but have significant diversity across
different sites. For example, Herrmann et al. use the size distribution of HTTP packets [11].

2. The next step is the training procedure. The attacker needs a training data set of fingerprint samples 
labeled with corresponding destination websites. Usually, these feature profiles are obtained by the 
attacker browsing websites himself/herself from the same (or similar) network connection as the user. 

3. In the final step, the attacker tests his knowledge from training on the victim user. He monitors traffic 
going to the user and matches extracted features with the profiles in his database. The one with most 
similarity is chosen as the website browsed by the user. 

As compared with previous work, using our remote traffic analysis technique for identifying websites 
introduces two additional challenges. First, previous work used fine-grained information like exact packet 
size distributions to create features, whereas in our setting this information is not available directly, since the 
queueing side channel produces only approximate sums of packet sizes. Second, previous work created a 
training set from the same vantage point that was then used for fingerprinting tests. An attacker performing 
remote traffic analysis must, of course, use a different environment for collecting the training set, potentially 
affecting the measured features. We describe our approaches to solving these two challenges next. 

3.1 Time Series-Based Feature 

Since it is hard to infer information about each single packet from our recovered pattern time series, we 
use the entire time series, which contains the estimated size of all HTTP packets downloaded during each 
probe period, to create one fingerprint trace. Identification of websites is based on the similarity between 
the observed fingerprints and samples in the training set. 

The challenge is to find a meaningful distance between fingerprint traces. Note that pointwise
comparisons will produce poor results. This is because parts of the fingerprint may be impacted by the noise
from a small queueing delay on a core Internet link. Additionally, the fingerprint could miss some
packets contained in the original traffic due to pattern recovery errors. Finally, even fingerprints of the
same website are not strictly synchronized in time due to the inter-packet delay variations. To deal with
these issues, we turn to the Dynamic Time Warping (DTW) distance [ ]. DTW was developed for use in speech
processing to account for the fact that when people speak, they pronounce various features of the phonemes
at different speeds, and do not always enunciate all of the features. DTW attempts to find the best
alignment of two time series by creating a non-linear time warp between the sequences. Figure 5 visualizes
the DTW-based distance between two time series: $A = \{a_1, a_2, \ldots, a_I\}$ and
$B = \{b_1, b_2, \ldots, b_J\}$. Let function $F(c) = \{c(1), \ldots, c(K)\}$ be a mapping from series $A$ to
series $B$, where $c(k) = (a_i, b_j)$. For every pair of matched points based on the mapping, we define the
distance as $d(c(k)) = d(i, j) = |a_i - b_j|$. The final distance between $A$ and $B$ can then be defined as
a weighted and normalized sum over all matched point pairs as

$$D(A, B) = \min_F \left[ \frac{\sum_{k=1}^{K} d(c(k))\, w(k)}{\sum_{k=1}^{K} w(k)} \right]$$

The weights $w(k)$ are flexible parameters picked based on the specific application scenario. Applying
dynamic programming, one can find the warping function with minimum distance, which captures the
similarity between the two time series under the best matched alignment.

In our attack, we apply the DTW-based distance to account for the estimation errors and time
desynchronization in fingerprints. Based on the distances to the training data set, the attacker can
decide whether a test sample indicates that the user browsed the website of interest.
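
A minimal dynamic-programming sketch of a DTW distance of this form is given below in Python. The text leaves the weights $w(k)$ open, so the sketch assumes the common symmetric choice (weight 1 for a horizontal or vertical step, 2 for a diagonal step), for which the normalizing sum of weights is always $I + J$. The function name is hypothetical and the sequences are made up for illustration.

```python
def dtw_distance(a, b):
    """Normalized DTW distance between two numeric sequences.

    Assumes symmetric step weights (1 for horizontal/vertical steps, 2 for
    diagonal steps), so the sum of weights along any warping path is
    len(a) + len(b) and can be divided out after the minimization.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # g[i][j]: minimal weighted cost of aligning a[:i] with b[:j].
    g = [[inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])       # pointwise distance d(i, j)
            g[i][j] = min(
                g[i - 1][j] + d,               # vertical step, weight 1
                g[i - 1][j - 1] + 2 * d,       # diagonal step, weight 2
                g[i][j - 1] + d,               # horizontal step, weight 1
            )
    return g[n][m] / (n + m)


# The same burst shifted by one probe interval.
trace_a = [0, 0, 5, 0]
trace_b = [0, 5, 0, 0]
pointwise = sum(abs(x - y) for x, y in zip(trace_a, trace_b))
print(pointwise, dtw_distance(trace_a, trace_b))
```

The pointwise comparison penalizes the shifted burst heavily, while the warped alignment matches the two bursts and reports no difference, which is the behavior that motivates DTW for desynchronized fingerprints.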
Figure 5: Warping function in DTW



3.2 Training Environment 

To obtain an accurate training fingerprint for a particular user's traffic, the attacker must be able to
replicate the network conditions on that user's home network. The approach we use is to set up a virtual
machine running a browser that is connected to the Internet via a virtual Dummynet link [ ]. The virtual
machine is then scripted to fetch a set of web pages of interest; at the same time, an outside probe is sent
across the Dummynet link, simulating the attack conditions on a real DSL link.

A number of parameters of the link need to be carefully decided. We found that the most important 
parameter for the attacker to replicate was the link bandwidth. First, as discussed in §2.3, the probe frequency 
should be adjusted based on the link bandwidth. Bandwidth also affects the magnitude of observed queuing 
delays. Additionally, it can significantly alter the traffic pattern itself, as TCP congestion control mechanisms 
are affected by the available bandwidth. Fortunately, estimating the bandwidth on a link is a well-studied 
problem [20, 22, 28]. In our tests, we use a packet-train technique by sending a burst of probe packets and 
measuring the rate at which responses are returned. Since most DSL lines have asymmetric bandwidth, 
we used TCP ACK packets with 1000 data bytes to measure the download bandwidth on the link. The 
target would send a short TCP reset packet for each ACK that it received, with the spacing between resets 
indicating the downstream bandwidth; we found this method to be fairly accurate. 
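
The bandwidth estimate from a packet train reduces to dividing the probe size by the spacing between consecutive responses. The Python sketch below illustrates that calculation only; it does not send packets. The function name, the use of the median gap to damp outliers, and the example timestamps are assumptions made for the illustration, and `probe_bytes` should be the on-the-wire size of each probe (data bytes plus headers).

```python
import statistics


def estimate_bandwidth_bps(response_times_s, probe_bytes):
    """Estimate bottleneck bandwidth from the arrival times (seconds) of
    responses to a back-to-back probe train.

    Back-to-back probes are spread out by the bottleneck link, so the gap
    between consecutive responses approximates the time the link needs to
    transmit one probe. The median gap is used to reduce the effect of
    occasional cross-traffic (a design choice of this sketch).
    """
    if len(response_times_s) < 2:
        raise ValueError("need at least two responses")
    gaps = [later - earlier
            for earlier, later in zip(response_times_s, response_times_s[1:])]
    return probe_bytes * 8 / statistics.median(gaps)


# Hypothetical train: 1500-byte probes whose responses arrive 4 ms apart.
timestamps = [0.000, 0.004, 0.008, 0.012]
estimate = estimate_bandwidth_bps(timestamps, probe_bytes=1500)
print(round(estimate))
```

With these made-up timestamps the estimate comes out at about 3 Mbps, matching the per-packet delay of roughly 4 ms discussed in §2.2.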

The round-trip time between the home router and the website hosts also affects the fingerprint. When
opening a webpage, the browser can download objects from several host servers. The traffic pattern is the
sum of all download connections, hence the shape of the observed fingerprint does depend on the RTTs to
these servers. However, we did not explicitly model this parameter, given the difficulty of accurately tuning
the link delays to multiple destinations. The resulting effects are discussed further in §4.

The fingerprint may be affected by the choice of browser and operating system as well; for best results,
the training environment should model the target as closely as possible. Information about browser and
operating system versions can be easily obtained if the target can be convinced to visit a website run by the
attacker; additionally, fingerprinting techniques in [18] may be used to recover some of this information.

3.3 Attack Scenarios 

We consider several attack scenarios that make use of website fingerprinting. We first consider the 
classic website fingerprinting scenario: Bob obtains traces from Alice's computer by sending probes to her 
DSL router and compares them to fingerprints of websites that he has generated, in order to learn about 
her browsing habits. Note that this can be seen as a classification task: each web request in Alice's trace 
is classified as belonging to one of a set of sites. This scenario has been used in most of the previous work on 
website fingerprinting, but it introduces the requirement that Bob must know the set of potential sites that 
Alice may visit. Without some prior information about Alice's browsing habits, this potential set includes 
every site on the Internet, making it infeasible to generate a comprehensive set of fingerprints. One could 
create fingerprints for popular sites only, but this reduces the accuracy of the classification task [6, 11, 29]. 
For example, the top 1000 US sites, as tracked by Alexa, are responsible for only 56% of all page views; 
therefore, even a perfect classifier trained on these 1000 sites would give the wrong result nearly half the 
time. 5 

We therefore consider a different scenario, where Bob wants to detect whether Alice visits a particular 
site. For example, if Bob is Alice's employer, he may wish to check to see if she is considering going to 
work for Bob's competitor, Carol. To carry out this attack, Bob would create a fingerprint for Carol's jobs 
site; he would then perform a binary classification task on Alice's traffic, trying to decide whether a trace 
represents a visit to the target site or some other site on the Internet. As we will see, such binary classification 
can be performed with relatively high accuracy for some choices of sites. Note that, as Alice's employer, 
Bob has plenty of opportunities to learn information about Alice's home network, such as her IP address, 
browser and operating system versions, and download bandwidth, by observing Alice when she connects to 
a password-protected Intranet site, and can therefore use this information to create accurate training data for 
building fingerprints. 

As another example, Bob may be trying to identify an employee who makes posts to a web message 
board critical of Bob. 6 Bob can similarly build profiles, tailored for each employee's home computer, of the 
web board and perform remote traffic analysis. He can then correlate any detected matches to the times of 
the posts by the offending pseudonym; note that this deanonymization attack is able to tolerate a significant 
number of false-positive and false-negative errors by combining observations over many days to improve 
confidence [7]. 

4 Evaluation 

We next present the results of our website detection attack. First, we describe the experimental setups and the data 
collection procedure. 

4.1 Experimental Setups 

We built a DSL-Setup consisting of a target system and a ping server, as shown in Figure 6(a). The target 
system captured the real environment of a home user. It ran on a laptop, located in our city (inside the US), 
connected to a DSL line with 3 Mbps download and 512 Kbps upload speeds. On the laptop, we used a shell 
script to automatically load websites using Firefox 4.0 7. The ping server was a commercial hosting system, 
located in the Canadian province of Quebec, acting as the remote attacker. It was scripted to send pings at 
precise time intervals with hping 8 and record ping traces with tcpdump 9. We set the ping interval to 2 ms. 

Figure 6: Experimental setups for website detection. (a) The DSL-Setup for test; (b) the VM-Setup for training. 

5 In fact, the situation is even worse, since Alexa counts all page views within a certain top-level domain, whereas fingerprints 
must be created on each individual URL. 

6 This example is motivated by several actual cases of companies seeking to do this; see 
https://www.eff.org/cases/usa-technologies-v-stokklerk and https://www.eff.org/cases/first-cash-v-john-doe. 

To emulate the attacker's training procedure, we also built a VM-Setup, a VMware ESX host testbed 
located in our lab, as shown in Figure 6(b). On this machine, we ran several VMware guest operating 
systems: an Ubuntu VM client, a virtual router, and a host implementing a transparent Dummynet link. The 
Ubuntu VM client acted as a virtual target, and was scripted to browse websites using Firefox, similar to 
the real home user. The virtual router provided NAT service for the client, and was connected to the Internet 
through the Dummynet link. The Dummynet bridge was configured to replicate the network conditions of 
the target DSL link (i.e., the bandwidths). As in the DSL-Setup, we periodically sent probes from another host 
outside the constrained Dummynet link to the virtual NAT router. The attacker then collected training 
fingerprints while the virtual client was browsing websites through this virtual 'DSL' link. Note that the virtual 
router and ping host were connected to the same dedicated high-speed LAN, minimizing the impact of 
noise added by intermediate routers or network congestion caused by other hosts. 

4.2 Data Collection 

We collected fingerprints of the front pages of 1000 websites from the Alexa top list 10. For websites that 
have multiple mirrors in different countries, such as www.google.com, we only considered the site with the 
highest rank. We excluded websites with extremely long loading times (greater than 60 s). For each website, 
we collected 12 fingerprint samples from both the DSL and VM setups, with a delay of half an hour between 
consecutive samples. Following the assumptions of previous papers [11, 17], the browsers were 
configured appropriately (no caching, no automatic update checks, and no unnecessary plugins). This makes 
our results comparable with previous work. 

4.3 Website Detection 

We first analyze the ability of an attacker to detect whether a user visits a particular site. To do so, the attacker 
checks whether the distance between the user trace and the target website is smaller than some threshold, 
and if so, the website is considered detected. This is a binary classification task and its performance can 
be characterized by the rates of false positives (a different website incorrectly identified as the target) and 
false negatives (the target website not being identified). The choice of threshold $v$ creates a tradeoff between 
the two rates: a smaller threshold will decrease false positives at the expense of false negatives. 

7 http://www.mozilla.com/firefox/ 
8 http://www.hping.org 
9 http://www.tcpdump.org 
10 http://www.alexa.com 
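
The detection rule can be sketched as follows. The distance function of §3.1 is not reproduced here; as a stand-in, the sketch uses a textbook dynamic time warping distance with an absolute-difference local cost, and the function names and the averaging over training samples are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Uses |x - y| as the local cost. This is a generic stand-in, not the
    DTW-based distance defined in Section 3.1.
    """
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for x in a:
        cur = [inf]  # column 0 is unreachable once a sample is consumed
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            cur.append(cost + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def is_detected(trace, training_samples, threshold):
    """Report a detection when the mean distance to the training samples
    of the target site is below the threshold."""
    total = sum(dtw_distance(trace, s) for s in training_samples)
    return total / len(training_samples) < threshold
```

Lowering `threshold` makes `is_detected` stricter, which corresponds to the tradeoff described above.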

To estimate the false-positive rate given a particular threshold $v$, we fix a target site $t$ and use the 12 samples 
$T = \{s_{t,1}, \ldots, s_{t,12}\}$ as the training set. We use the samples from the other sites as a test set; i.e., $U = \{s_{i,j}\}$ 
for $i \neq t$, $j \in \{1, \ldots, 12\}$. Given a sample $s_{i,j} \in U$, we calculate the average distance from it to the training 
samples: 

$$d_t(s_{i,j}) = \frac{1}{12} \sum_{k=1}^{12} D(s_{i,j}, s_{t,k}) \qquad (4)$$

where $D(\cdot,\cdot)$ is the DTW-based distance function defined in §3.1. We then consider every sample $s_{i,j} \in U$ 
such that $d_t(s_{i,j}) < v$ to be a false positive and therefore estimate the false-positive rate: 

$$\hat{p}_t = \frac{\left|\{ s_{i,j} \in U : d_t(s_{i,j}) < v \}\right|}{|U|} \qquad (5)$$

To estimate the false-negative rate, we pick one of the samples $s_{t,i}$ and calculate its average distance to 
the other 11 samples $s_{t,j}$, $j \neq i$, and count it as a false negative if the distance is at least $v$. We then repeat 
this process for each $i = 1, \ldots, 12$: 

$$\hat{q}_t = \frac{\left|\{ s_{t,i} : i \in \{1, \ldots, 12\},\ \sum_{j \neq i} D(s_{t,i}, s_{t,j})/11 \geq v \}\right|}{12} \qquad (6)$$
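
The two estimates described above can be sketched in code. The distance function is passed in as a parameter `dist`, and the function names and data layout are assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def false_positive_rate(target_samples, other_samples, dist, v):
    """Fraction of non-target samples whose mean distance to the target's
    training samples falls below the threshold v."""
    hits = sum(
        1
        for s in other_samples
        if mean([dist(s, t) for t in target_samples]) < v
    )
    return hits / len(other_samples)


def false_negative_rate(target_samples, dist, v):
    """Leave-one-out fraction of target samples whose mean distance to the
    remaining target samples is at least v."""
    misses = 0
    for i, s in enumerate(target_samples):
        rest = target_samples[:i] + target_samples[i + 1:]
        if mean([dist(s, r) for r in rest]) >= v:
            misses += 1
    return misses / len(target_samples)
```

With 12 samples per site, `target_samples` would hold the 12 traces of the target and `other_samples` the traces of all remaining sites.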

Given a target false-positive rate of $p^*$, we can calculate the threshold $v^*$ that would ensure $p_t < p^*$. 
Note that because $\hat{p}_t$ is only an estimate of $p_t$, we calculate a 95% confidence interval for $p_t$ and choose $v$ 
such that the upper limit of the CI is below $p^*$. 11 Note that this threshold will be different for each site. We 
can then estimate the corresponding false-negative rate $q^*$ at this threshold. 
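
A minimal sketch of this threshold selection follows. The text only specifies a binomial proportion confidence interval; the use of the Wilson score interval, the candidate thresholds (the observed distances themselves), and all names below are assumptions made for illustration.

```python
import math


def wilson_upper(k, n, z=1.96):
    """Upper limit of the Wilson score interval for k successes in n trials
    (z = 1.96 corresponds to a 95% interval)."""
    if n <= 0:
        raise ValueError("n must be positive")
    phat = k / n
    denom = 1 + z * z / n
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return (centre + margin) / denom


def choose_threshold(other_distances, p_target):
    """Largest candidate threshold whose false-positive upper bound is below
    p_target, or None if no candidate qualifies.

    `other_distances` holds the mean distance to the target's training
    samples for every sample of the other sites.
    """
    n = len(other_distances)
    best = None
    for v in sorted(set(other_distances)):
        k = sum(1 for d in other_distances if d < v)  # false positives at v
        if wilson_upper(k, n) < p_target:
            best = v
        else:
            break  # k only grows with v, so larger thresholds also fail
    return best
```

Because the bound depends on the distances observed for each target, the selected threshold differs from site to site.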

The target false positive rate will largely depend on the prior knowledge the attacker has. Typically, we 
will want to aim for a small false-positive rate, since even if Bob considers it likely that Alice does in fact 
visit the target site t at some point, most of the web browsing in any trace will still be to other sites; thus a 
low false-positive rate is needed for the test to have high positive predictive value. On the other hand, Bob 
can easily tolerate a moderate false-negative rate, since even if he only finds out about employees searching 
for other jobs 90%, or even 50% of the time, this information is useful nevertheless. Likewise, perfect 
detection is not needed for the potential attack to have a chilling effect on Alice's behavior. 
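
The effect of the false-positive rate on positive predictive value can be made concrete with Bayes' rule. In the sketch below, `prior` is the fraction of Alice's page loads that go to the target site; this quantity and the example values in the usage note are hypothetical.

```python
def positive_predictive_value(prior, false_positive_rate, false_negative_rate):
    """Probability that a reported detection is a real visit to the target.

    prior: assumed fraction of page loads that are visits to the target site.
    """
    true_pos = prior * (1 - false_negative_rate)
    false_pos = (1 - prior) * false_positive_rate
    if true_pos + false_pos == 0:
        raise ValueError("no detections are possible with these rates")
    return true_pos / (true_pos + false_pos)
```

With a hypothetical prior of 1% and a false-negative rate of 17%, a false-positive rate of 0.5% gives a positive predictive value of about 0.63, whereas a false-positive rate of 5% gives only about 0.14.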

Figure 7 shows the false-negative rates that can be achieved given target false-positive rates of 0.5%, 
1%, and 5%. Each bar represents a cumulative number of websites, i.e., websites for which $q^*$ is at or 
below the x-axis value. We show two sets of results; one using the VM setup for training and DSL for 
testing (Figure 7(a)) and one using the DSL samples for both training and test data sets (Figure 7(b)). Note 
that there is a significant difference between the two graphs, resulting from the discrepancies between the 
simulated (VM) and the test environment. We expect that, with some work, an attacker may be able to reduce 
such discrepancies by more carefully tuning the parameters of the virtual machine and the simulated link, 
or by using actual hardware and a real DSL line that mimics Alice's setup. The DSL-DSL case therefore 
shows the limits of what can be achieved by improving the training environment. 

An important observation is that, in both cases, the success of the web detection is highly dependent on 
the target site. For a small number of sites (75 in the VM-DSL case and 320 in the DSL-DSL case), the 
web detection attack works very well: we are able to maintain a very low false-positive rate of 0.5% while 
experiencing few false negatives (17% or below). On the other hand, some sites are virtually invulnerable 
to our attack: for 65 of the sites we tested, we were unable to observe any true positives with a target 
false-positive rate of 5% (i.e., $q^* = 1$), even in the best-case DSL-DSL scenario. We found these sites to have 
either very short traces, making it difficult to distinguish them from other such sites, or highly variable traffic 
patterns due, for example, to dynamic content, making it difficult to create a useful fingerprint. 

Figure 7: Number of sites with a given false negative rate or smaller. (a) VM-DSL case; (b) DSL-DSL case. 

11 We use a binomial proportion confidence interval here. This is slightly imprecise, as the 12 · 999 samples are not independent; 
we leave computation of confidence intervals that take this into account for future work. 

4.4 Deanonymization 

We next consider the deanonymization attack described in §3.3. As a case study, we considered the site 
www.warriorforum.com, a popular Internet marketing forum. It uses the vBulletin software, which was, as 
of August 2011, the most popular bulletin board software 12 , and thus should be representative of a number 
of other sites. In our attack scenario, Bob wishes to find out if Alice is using a particular pseudonym (say, 
"dianel23") to post on the site. To accomplish this, he first collects traces from Alice's home computer for 
a period of time. He then waits for posts to the forum from dianel23 and performs a detection attack to see 
if Alice was visiting the site at the time of each post. Repeated successful matches can then be used to obtain 
increasing confidence in tying Alice to dianel23. 

12 http://www.big-boards.com/statistics/ 

Figure 8: Detection performance for Warrior Forum when training with the correct page and when using 
other pages (x-axis: upper bound for false positive rate). 

Note that Bob will need to build a profile that targets internal pages of www.warriorforum.com, rather than 
the front page. Alice's post requests will be too small to create an easily observable feature; however, 
vBulletin displays the forum thread after a post has been made. Therefore, Bob can collect samples visiting 
threads where dianel23 has posted to create a fingerprint. Note that, in this attack, fingerprint creation 
happens after the trace collection. A problem facing Bob is that different post pages on Warrior Forum will 
have similar features in their RTT profile. A match, therefore, can show that Alice visited some Warrior Forum 
page with high confidence, but it may not have been the correct thread. Even this information, however, 
is likely to be enough for deanonymization. For example, Alexa shows that fewer than 1% of US Internet 
users actually visit the Warrior Forum, and those that do tend to stay on the site less than 10 minutes on 
average. If we make the simplifying assumption that these visits are distributed randomly across a three-hour 
evening period, the additional false-positive rate due to random visits to other Warrior Forum pages is no 
more than 6%, even if Alice is known to be a Warrior Forum user. Combining observations across several 
posts allows Bob to improve his confidence. 
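
A bound of this size can be reproduced with a back-of-the-envelope calculation. As an illustrative assumption, suppose Alice makes at most one visit of about 10 minutes, placed uniformly at random within the 180-minute evening window; the chance that this visit covers the time of a given post is then roughly

$$\Pr[\text{chance match}] \approx \frac{10\ \text{min}}{180\ \text{min}} \approx 0.056 < 6\%.$$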

Over the long term, even simpler attacks may suffice: since most people's Internet usage is bursty, simply 
observing that Alice was always actively using the Internet in some way whenever dianel23 made posts can be 
used for deanonymization [ ]. Likewise, Bob may be able to rule out Alice as a suspect if she was known 
to be at home (due to recent DSL activity) but her connection was idle at the times of the target posts. 

Finally, Bob may be able to use the similarity between internal forum pages to his advantage. In particular, 
suppose that Alice publicly participates in the forum under her real identity, in addition to potentially 
posting under a pseudonym. Bob can use the times of Alice's posts under her real name to label traces 
collected from Alice's computer and create a training set. In this case, Bob does not need to simulate Alice's 
computing environment, as the training and test environments are exactly the same: the ideal conditions 
we used in the DSL-DSL case. To study this attack, we collected samples from 100 different posts on the 
Warrior Forum site. For each sample, we attempt to match it to a fingerprint created from the other 99 posts; 
our process is similar to (6), except using a different training set for each sample. From this, we estimate the 
false-negative rate for a given target false-positive rate, calculated using (5), using the traces from 999 other websites. 
Figure 8 shows the results. The use of different pages to test degrades the matching performance, but it still 
provides sufficient detection power for deanonymization after a few posts. 
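
To illustrate why a few posts can suffice, the sketch below computes the probability that a user unrelated to the pseudonym would match at least a given number of posts purely by chance. Treating the posts as independent trials with a common per-post false-positive rate is a simplifying assumption, and the function name is hypothetical.

```python
from math import comb


def chance_match_probability(n_posts, n_matches, per_post_fp):
    """Probability of at least n_matches chance matches among n_posts,
    assuming independent posts with the same false-positive rate."""
    return sum(
        comb(n_posts, k) * per_post_fp**k * (1 - per_post_fp) ** (n_posts - k)
        for k in range(n_matches, n_posts + 1)
    )
```

With a per-post false-positive rate of 6%, for example, matching all of five posts by chance has probability below one in a million.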

5 Discussion 

In this section, we discuss some limitations of our work. 

1. Multiple users. In the scenario of our remote traffic analysis, the attacker's probes cannot distinguish 
between the traffic of multiple users on the same link, so shared broadband connections present an 
obstacle to our attack. However, even in multi-user installations, it is still common for only one user 
to be using the Internet at any given point during the day. Some previous work on traffic analysis has 
used blind source separation to separate traffic from multiple users [ ]; similar techniques may be 
applicable here. For example, in Figure 2, traffic follows a periodic pattern based on the RTT between 
Alice and the website; such periodicity might help separate the sources. 

2. Dynamic nature of websites. Our attack relies on websites having relatively stable fingerprints. 
Although the overall pattern captured by our RTT probes remains static enough within days, the website 
content may undergo significant changes (e.g., site redesigns) over time, which in turn will result in a 
change of its fingerprint. Thus, for best results, the training set should be updated continuously. This 
limitation applies to any website fingerprinting approach, even local website fingerprinting techniques 
that benefit from better vantage points [11, 17]. 

3. Content distribution networks. Websites that use content distribution networks (CDNs) will use dif- 
ferent servers to deliver content based on the user's location. They may present localized versions 
of the site to users in different countries or regions. As shown in our experimental results, this can 
cause fingerprints to differ significantly. If identifying these sites is a high priority for the attacker, 
additional work would be needed to obtain fingerprints of the right version by, for example, using 
proxies and other techniques to fool IP-based localization. 

4. Cache issues. In our tests, we followed the assumption in previous work [11, 17] and disabled the 
cache in the browser. This implies that our results demonstrate the attacker's ability to verify that a 
user visits a web page for the first time. To investigate cases with the cache enabled, one possible solution 
would be to build separate fingerprints based on the time since the site was first downloaded, e.g., after 
1 hour, 6 hours, 1 day, 1 week, to minimize the effect that caching would have on the attack. Note 
that with continuous observation of a computer, the attacker may be able to guess how long ago the 
last visit was. 



6 Related Work 

The use of network probes to infer information about traffic at a remote location has been explored in previous 
work in the context of anonymous communication networks. Murdoch and Danezis used a remote traffic 
analysis approach to expose the identity of relays participating in a circuit in the Tor [ ] and MorphMix [21] 
anonymous communication networks [ ]. Their approach was to send an on-off pattern of high-volume 
traffic through the anonymous tunnel and a low-volume probe to a router under test. If the waiting times of 
the probe showed a corresponding increase during the "on" periods, the router was assumed to be routing the 
flow. However, when Murdoch and Danezis evaluated their attack, the Tor network was lightly loaded and 
consisted of only 13 relays; to repeat their attack on today's network, with around 2000 relays and high traffic 
load 13, an attacker would need extremely large amounts of bandwidth to measure enough relays during 
the attack window. Evans et al. [9] strengthened Murdoch and Danezis's attack with a bandwidth amplification 
attack, which makes their attack feasible in modern-day deployments of Tor. Hopper et al. [13, 14] use 
a combination of Murdoch and Danezis's approach and pairwise round-trip times (RTTs) between Internet 
nodes to correlate Tor nodes to likely clients. Chakravarty et al. [ ] propose an attack for exposing Tor relays 
participating in a circuit of interest by modulating the bandwidth of an anonymous connection and then 
using available bandwidth estimation to observe this pattern as it propagates through the Tor network. Note 
that these techniques relied on detecting a specially crafted coarse-grained communication pattern, whereas 
our attacks make use of fine-grained information obtained through remote traffic analysis. 

We also survey previous work on recovering information about encrypted HTTP traffic. The fact that 
object sizes could be used to infer sensitive information, even after encryption, was first mentioned by 
Yee (as related by Wagner and Schneier [30]). A specific concern listed by Yee is that the particular page 
within a site accessed by the user could be revealed by considering URL and object lengths. Chen et al. [4] 
applied this observation to AJAX applications to recover detailed information about the internal state of the 
application and users' data. 

Cheng et al. [5] present the earliest implementation of website fingerprinting. The classification features 
used in their scheme are the object sizes and the HTML file sizes. Hintz [12] and Sun et al. [ ] both consider 
website fingerprinting attacks in SSL-encrypted HTTP connections. Their classification features are object 
sizes and counts. While Hintz did not present implementation details and experimental results, Sun et al. 
use a Jaccard's coefficient based classifier and show that their attack can achieve a correct identification rate 
of 75%. 

Instead of looking at web objects, Bissias et al. [ ], Liberatore et al. [ ], and Herrmann et al. [ ] study 
the statistical characteristics of individual packets in the traffic flows. Bissias et al. use packet sizes and 
inter-arrival timings as classification features. Their method is sensitive to changes in the network environment, 
as the inter-arrival timing is highly dependent on the specific routing path and varies from time to time. To 
address this problem, Liberatore et al. only use packet sizes and counts in classification. They implement 
both Jaccard coefficient and Naive Bayes classifiers, and show the efficacy of the attack in practice. Using 
a similar scheme, Herrmann et al. further improve the classification accuracy using a Multinomial Naive Bayes 
classifier. 

7 Conclusion 

We show that traffic analysis attacks can be carried out remotely, without access to the analyzed traffic, 
thus greatly increasing the attack surface and lowering the barrier to entry for conducting the attack. We 
identify a queuing side channel that can be used to infer the queue size of a given link with good accuracy 
and thus monitor traffic patterns. We show how this channel can be used to carry out a remote attack to 
detect a remote user's browsing patterns. This highlights the importance of traffic analysis attacks in today's 
connected Internet. 

13 See http://torstatus.blutmagie.de (retrieved November 2010) 